Cost Cumulant-Based Control for a Class of Linear Quadratic Tracking Problems


Khanh  D.  Pham 
Air Force Research Laboratory
Kirtland  AFB,  NM  87117  U.S.A. 


Abstract: Cost-cumulant control is currently receiving substantial attention from the stochastic control theory community. The present paper extends cost-cumulant controller design to a wide class of linear-quadratic tracking systems, in which the output measurements of a tracker follow a desired trajectory as closely as possible, by means of a complete statistical description of the associated integral-quadratic performance measure. It is shown that the tracking problem can be solved in two parts. First, a feedback control, whose optimization criterion is a linear combination of finitely many cumulant indices of the integral-quadratic performance measure associated with a linear stochastic tracking system over a finite horizon, is determined by a set of Riccati-type differential equations. Second, an affine control, which accounts for the dynamics mismatch between the desired trajectory and the tracker states, is found by solving an auxiliary set of differential equations (incorporating the desired trajectory) backward from a stable final time.

I.  Preliminaries 

An interesting extension of cost-cumulant control theory [4]-[7], when both perfect and noisy state measurements are available, is to make a linear stochastic system track a desired trajectory as closely as possible via a complete statistical description of the associated finite-horizon integral-quadratic performance measure. To the best of the author's knowledge, this theoretical development is the first of its kind. The optimal control problem considered herein is quite general: it enables the control engineer not only to penalize variations in, as well as levels of, the state and control variables, but also to characterize the probability distribution of the performance measure as needed in post-design controller analysis. Since the problem formulation is parameterized both by the number of cumulants and by the scalar coefficients in the linear combination, it defines a very general class of problems that contains the Linear-Quadratic-Gaussian (LQG) and Risk Sensitive problem classes. The special cases in which only the first cost cumulant is minimized, and in which a denumerable linear combination of cost cumulants is minimized, are the well-known minimum-mean LQG problem and the Risk Sensitive control objective, respectively. Practical applications of this theoretical development can be found in [2] and [3]: respectively, tactical and combat situations, in which a goal-seeking vehicle initially decides on an appropriate destination and then moves toward it in an optimal fashion, and tracking problems in economic stabilization policy.
Consider a linear stochastic tracking system governed by

$$dx(t) = \big(A(t)x(t) + B(t)u(t)\big)\,dt + G(t)\,dw(t), \quad x(t_0) = x_0, \tag{1}$$

$$y(t) = C(t)x(t), \tag{2}$$

where the deterministic coefficients satisfy $A \in C([t_0,t_f];\mathbb{R}^{n\times n})$, $B \in C([t_0,t_f];\mathbb{R}^{n\times m})$, $C \in C([t_0,t_f];\mathbb{R}^{r\times n})$, and $G \in C([t_0,t_f];\mathbb{R}^{n\times p})$. The system noise $w(t) \in \mathbb{R}^p$ is the $p$-dimensional stationary Wiener process starting from $t_0$, independent of the known initial state $x(t_0) = x_0$, and defined, with $\{\mathcal{F}_t\}_{t\ge 0}$ being its filtration, on a complete filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\ge 0}, \mathcal{P})$ over $[t_0,t_f]$ with the correlation

$$E\big\{[w(\tau)-w(\xi)][w(\tau)-w(\xi)]^T\big\} = W|\tau-\xi|, \quad W > 0.$$
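To make the model (1)-(2) concrete, the following minimal sketch simulates one sample path in the scalar case ($n=m=p=1$) with constant coefficients, using the Euler-Maruyama scheme. The function name, the constant-coefficient restriction, and the discretization are illustrative assumptions and are not part of the paper.

```python
import math
import random

def simulate_tracker(a, b, g, w_var, x0, u, t0, tf, steps, seed=0):
    """Euler-Maruyama path of the scalar system dx = (a x + b u) dt + g dw.

    Assumptions (not from the paper): constant coefficients a, b, g and a
    feedback law u(t, x) supplied by the caller. The Wiener increments have
    variance w_var * dt, matching the correlation W |tau - xi|.
    """
    rng = random.Random(seed)
    dt = (tf - t0) / steps
    ts, xs = [t0], [x0]
    for i in range(steps):
        t, x = ts[-1], xs[-1]
        dw = rng.gauss(0.0, math.sqrt(w_var * dt))  # Wiener increment
        xs.append(x + (a * x + b * u(t, x)) * dt + g * dw)
        ts.append(t0 + (i + 1) * dt)
    return ts, xs
```

With the noise switched off (`g = 0`) and no control, the path should approach the deterministic solution $x_0 e^{a(t-t_0)}$, which gives a quick sanity check of the discretization.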

The control input $u \in L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^m))$, the subset of the Hilbert space of $\mathbb{R}^m$-valued square-integrable processes on $[t_0,t_f]$ that are adapted to the $\sigma$-field $\mathcal{F}_t$ generated by $w(t)$, is selected so that the measured output $y \in L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^r))$ of the specified system model best matches the desired output $z \in L^2([t_0,t_f];\mathbb{R}^r)$ in the sense of the cost-cumulant optimization criterion made precise below. Associated with the initial condition $(t_0,x_0;u) \in [t_0,t_f]\times\mathbb{R}^n\times L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^m))$ is a traditional finite-horizon integral-quadratic-form (IQF) random cost $J : [t_0,t_f]\times\mathbb{R}^n\times L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^m)) \mapsto \mathbb{R}^+$ such that

$$J(t_0,x_0;u) = [z(t_f)-y(t_f)]^T Q_f [z(t_f)-y(t_f)] + \int_{t_0}^{t_f}\big\{[z(\tau)-y(\tau)]^T Q(\tau)[z(\tau)-y(\tau)] + u^T(\tau)R(\tau)u(\tau)\big\}\,d\tau, \tag{3}$$

in which the terminal error weighting $Q_f \in \mathbb{R}^{r\times r}$, the error weighting $Q \in C([t_0,t_f];\mathbb{R}^{r\times r})$, and the control input weighting $R \in C([t_0,t_f];\mathbb{R}^{m\times m})$ are deterministic, symmetric, and positive semi-definite, with $R(t)$ invertible.
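As an illustration of how the cost (3) is evaluated on one sampled trajectory, the sketch below approximates the integral by a left Riemann sum in the scalar case. The function name, the constant weights, and the quadrature rule are assumptions made for this example only.

```python
def iqf_cost(ts, ys, zs, us, q, r, qf):
    """Left-Riemann approximation of the scalar cost (3) along one path.

    ts: time grid; ys, zs, us: output, desired output and control samples
    on that grid. Constant weights q, r, qf are an illustrative assumption.
    """
    cost = qf * (zs[-1] - ys[-1]) ** 2  # terminal tracking penalty
    for i in range(len(ts) - 1):
        dt = ts[i + 1] - ts[i]
        cost += (q * (zs[i] - ys[i]) ** 2 + r * us[i] ** 2) * dt
    return cost
```

Because the path is random, each call on a new sample path returns one realization of the random cost $J$; the cumulants studied in the sequel describe the distribution of these realizations.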

In view of the linear system (1)-(2) and the quadratic cost (3), it is reasonable to assume that the control input is generated by a class of linear memoryless state-feedback strategies $\gamma : [t_0,t_f]\times L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^n)) \mapsto L^2_{\mathcal{F}_t}(\Omega; C([t_0,t_f];\mathbb{R}^m))$ of the form

$$u(t) = \gamma(t,x(t)) = K(t)x(t) + u_z(t), \tag{4}$$

where $u_z \in C([t_0,t_f];\mathbb{R}^m)$ is an additional control signal that accounts for the dynamics mismatch between the tracking states $x(t)$ and the desired trajectory $z(t)$ on $[t_0,t_f]$, and $K \in C([t_0,t_f];\mathbb{R}^{m\times n})$ is an admissible feedback gain in a sense to be specified later. Hence, for the given initial condition $(t_0,x_0) \in [t_0,t_f]\times\mathbb{R}^n$ and subject to the control policy (4), the dynamics of the tracking problem are governed by

$$dx(t) = [A(t)+B(t)K(t)]x(t)\,dt + B(t)u_z(t)\,dt + G(t)\,dw(t), \quad x(t_0) = x_0, \tag{5}$$

$$y(t) = C(t)x(t), \tag{6}$$

and the IQF random cost

$$J(t_0,x_0;K,u_z) = [z(t_f)-y(t_f)]^T Q_f [z(t_f)-y(t_f)] + \int_{t_0}^{t_f}\big\{[z(\tau)-y(\tau)]^T Q(\tau)[z(\tau)-y(\tau)] + [K(\tau)x(\tau)+u_z(\tau)]^T R(\tau)[K(\tau)x(\tau)+u_z(\tau)]\big\}\,d\tau. \tag{7}$$
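A short expansion, added here as a reading aid, shows why the closed-loop cost is quadratic-affine in the state. With $y = Cx$ and $u = Kx + u_z$, time arguments suppressed, and $Q$, $R$ symmetric, the two integrand terms expand as

$$[z - Cx]^T Q [z - Cx] = x^T C^T Q C x - 2x^T C^T Q z + z^T Q z,$$

$$[Kx + u_z]^T R [Kx + u_z] = x^T K^T R K x + 2x^T K^T R u_z + u_z^T R u_z.$$

The terms that are quadratic, linear, and constant in $x$ are the origin of the matrix-valued, vector-valued, and scalar-valued quantities that the subsequent equations propagate backward in time.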

It is now necessary to develop a procedure for generating cost cumulants for the tracking problem. These cost cumulants are then used to form a performance index in the cost-cumulant control optimization. In general, the initial condition $(t_0,x_0)$ is replaced by an arbitrary pair $(\alpha,x_\alpha)$. Then, for the given affine input $u_z$ and admissible feedback gain $K$, the random cost (7) is seen as the "cost-to-go" $J(\alpha,x_\alpha)$. The moment-generating function of this random cost, evaluated along the vector-valued random process (5), is defined by

$$\varphi(\alpha,x_\alpha;\theta) = E\{\exp(\theta J(\alpha,x_\alpha))\}, \tag{8}$$

where the scalar $\theta \in \mathbb{R}^+$ is a small parameter. Thus, the cumulant-generating function immediately follows

$$\psi(\alpha,x_\alpha;\theta) = \ln\{\varphi(\alpha,x_\alpha;\theta)\}, \tag{9}$$

in which $\ln\{\cdot\}$ denotes the natural logarithmic transformation of an enclosed entity.
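To make the relation between (8) and (9) concrete, the sketch below converts raw moments (the series coefficients of a moment-generating function) into cumulants (the series coefficients of its logarithm) with the standard moment-cumulant recursion. The function name and the chi-square test values are illustrative assumptions and do not come from the paper.

```python
from math import comb

def cumulants_from_moments(moments):
    """Convert raw moments to cumulants.

    moments[n-1] holds E[J^n]; the result holds kappa_1, ..., kappa_N.
    Uses kappa_n = m_n - sum_{j=1}^{n-1} C(n-1, j-1) * kappa_j * m_{n-j}.
    """
    kappas = []
    for n in range(1, len(moments) + 1):
        correction = sum(
            comb(n - 1, j - 1) * kappas[j - 1] * moments[n - j - 1]
            for j in range(1, n)
        )
        kappas.append(moments[n - 1] - correction)
    return kappas

# Chi-square variable with one degree of freedom: raw moments 1, 3, 15, 105.
print(cumulants_from_moments([1, 3, 15, 105]))  # prints [1, 2, 8, 48]
```

The first two cumulants are the mean and the variance of the cost; higher cumulants capture the skewness and tail behavior that a mean-only (LQG) design ignores.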

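Theorem 1: Cost-Cumulant Generating Equations.

For $\alpha \in [t_0,t_f]$ and $\theta \in \mathbb{R}^+$, define $\varphi(\alpha,x_\alpha;\theta) = \varrho(\alpha,\theta)\exp\{x_\alpha^T\Upsilon(\alpha,\theta)x_\alpha + 2x_\alpha^T\eta(\alpha,\theta)\}$ and $\upsilon(\alpha,\theta) = \ln\{\varrho(\alpha,\theta)\}$. Then, the cost cumulant-generating function can be expressed as follows

$$\psi(\alpha,x_\alpha;\theta) = x_\alpha^T\Upsilon(\alpha,\theta)x_\alpha + 2x_\alpha^T\eta(\alpha,\theta) + \upsilon(\alpha,\theta), \tag{10}$$

where $\Upsilon(\alpha,\theta)$, $\eta(\alpha,\theta)$, and $\upsilon(\alpha,\theta)$ solve the backward-in-time differential equations

$$\begin{aligned}\frac{d}{d\alpha}\Upsilon(\alpha,\theta) = {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\Upsilon(\alpha,\theta) - \Upsilon(\alpha,\theta)[A(\alpha)+B(\alpha)K(\alpha)] \\ & - 2\Upsilon(\alpha,\theta)G(\alpha)WG^T(\alpha)\Upsilon(\alpha,\theta) - \theta C^T(\alpha)Q(\alpha)C(\alpha) - \theta K^T(\alpha)R(\alpha)K(\alpha),\end{aligned} \tag{11}$$

$$\begin{aligned}\frac{d}{d\alpha}\eta(\alpha,\theta) = {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\eta(\alpha,\theta) - \Upsilon(\alpha,\theta)B(\alpha)u_z(\alpha) \\ & - \theta K^T(\alpha)R(\alpha)u_z(\alpha) + \theta C^T(\alpha)Q(\alpha)z(\alpha),\end{aligned} \tag{12}$$

$$\begin{aligned}\frac{d}{d\alpha}\upsilon(\alpha,\theta) = {} & -\mathrm{Tr}\{\Upsilon(\alpha,\theta)G(\alpha)WG^T(\alpha)\} - 2\eta^T(\alpha,\theta)B(\alpha)u_z(\alpha) \\ & - \theta u_z^T(\alpha)R(\alpha)u_z(\alpha) - \theta z^T(\alpha)Q(\alpha)z(\alpha),\end{aligned} \tag{13}$$

with the terminal conditions $\Upsilon(t_f,\theta) = \theta C^T(t_f)Q_fC(t_f)$, $\eta(t_f,\theta) = -\theta C^T(t_f)Q_fz(t_f)$, and $\upsilon(t_f,\theta) = \theta z^T(t_f)Q_fz(t_f)$.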

Remark. The expression (10) indicates that, in the tracking problem, the additional second and third terms account for the dynamics mismatch through their own trajectory-governing equations (12) and (13).

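By definition, cost cumulants for the tracking problem can be generated by employing the Maclaurin series expansion of the cumulant-generating function

$$\psi(\alpha,x_\alpha;\theta) = \sum_{i=1}^{\infty}\kappa_i(\alpha,x_\alpha)\frac{\theta^i}{i!} = \sum_{i=1}^{\infty}\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\psi(\alpha,x_\alpha;\theta)\right|_{\theta=0}\frac{\theta^i}{i!}, \tag{14}$$

in which $\kappa_i(\alpha,x_\alpha)$ are called cost cumulants. Furthermore, the series coefficients of the expansion are computed by using (10)

$$\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\psi(\alpha,x_\alpha;\theta)\right|_{\theta=0} = x_\alpha^T\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\Upsilon(\alpha,\theta)\right|_{\theta=0}x_\alpha + 2x_\alpha^T\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\eta(\alpha,\theta)\right|_{\theta=0} + \left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\upsilon(\alpha,\theta)\right|_{\theta=0}. \tag{15}$$

In view of the results (14) and (15), cost cumulants for the tracking problem are obtained as follows

$$\kappa_i(\alpha,x_\alpha) = x_\alpha^T\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\Upsilon(\alpha,\theta)\right|_{\theta=0}x_\alpha + 2x_\alpha^T\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\eta(\alpha,\theta)\right|_{\theta=0} + \left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\upsilon(\alpha,\theta)\right|_{\theta=0} \tag{16}$$

for any finite $1 \le i < \infty$. For notational convenience, denote

$$H(\alpha,i) \triangleq \left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\Upsilon(\alpha,\theta)\right|_{\theta=0}, \quad \breve{D}(\alpha,i) \triangleq \left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\eta(\alpha,\theta)\right|_{\theta=0}, \quad D(\alpha,i) \triangleq \left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\upsilon(\alpha,\theta)\right|_{\theta=0}.$$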
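As a reading aid (a sketch, not part of the original derivation), the recursion obeyed by these derivatives can be anticipated by differentiating (11) $i$ times with respect to $\theta$ and evaluating at $\theta = 0$. Write $H(\alpha,i)$ for the $i$-th $\theta$-derivative of $\Upsilon(\alpha,\theta)$ at $\theta = 0$. Since the terminal condition and every forcing term in (11) are proportional to $\theta$, one has $\Upsilon(\alpha,0) = 0$, and the Leibniz rule applied to the quadratic term gives

$$\left.\frac{\partial^{(i)}}{\partial\theta^{(i)}}\Big[2\Upsilon(\alpha,\theta)G(\alpha)WG^T(\alpha)\Upsilon(\alpha,\theta)\Big]\right|_{\theta=0} = \sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}\,H(\alpha,j)G(\alpha)WG^T(\alpha)H(\alpha,i-j).$$

The $j = 0$ and $j = i$ terms drop out because $H(\alpha,0) = \Upsilon(\alpha,0) = 0$, and the terms of (11) that are linear in $\theta$ contribute only when $i = 1$. This explains why the weighting matrices $Q$ and $R$ drive only the first-order equation, while the noise intensity $W$ couples the higher-order ones.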

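Theorem 2: Cost Cumulants in Tracking Problems.

The tracker dynamics governed by (5)-(6) attempt to track the desired trajectory $z(t)$ with the IQF cost (7). For $k \in \mathbb{Z}^+$ fixed, the $k$th cost cumulant of the Chi-square type random cost (7) is given by

$$\kappa_k(t_0,x_0;K,u_z) = x_0^TH(t_0,k)x_0 + 2x_0^T\breve{D}(t_0,k) + D(t_0,k), \tag{17}$$

where $\{H(\alpha,i)\}_{i=1}^k$, $\{\breve{D}(\alpha,i)\}_{i=1}^k$, and $\{D(\alpha,i)\}_{i=1}^k$ evaluated at $\alpha = t_0$ satisfy the matrix-, vector-, and scalar-valued differential equations (with the dependence of $H(\alpha,i)$, $\breve{D}(\alpha,i)$, and $D(\alpha,i)$ upon $u_z$ and $K$ suppressed)

$$\begin{aligned}\frac{d}{d\alpha}H(\alpha,1) = {} & -[A(\alpha)+B(\alpha)K(\alpha)]^TH(\alpha,1) - H(\alpha,1)[A(\alpha)+B(\alpha)K(\alpha)] \\ & - C^T(\alpha)Q(\alpha)C(\alpha) - K^T(\alpha)R(\alpha)K(\alpha),\end{aligned} \tag{18}$$

$$\begin{aligned}\frac{d}{d\alpha}H(\alpha,i) = {} & -[A(\alpha)+B(\alpha)K(\alpha)]^TH(\alpha,i) - H(\alpha,i)[A(\alpha)+B(\alpha)K(\alpha)] \\ & - \sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}H(\alpha,j)G(\alpha)WG^T(\alpha)H(\alpha,i-j), \quad 2 \le i \le k,\end{aligned} \tag{19}$$

together with

$$\begin{aligned}\frac{d}{d\alpha}\breve{D}(\alpha,1) = {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\breve{D}(\alpha,1) - H(\alpha,1)B(\alpha)u_z(\alpha) \\ & - K^T(\alpha)R(\alpha)u_z(\alpha) + C^T(\alpha)Q(\alpha)z(\alpha),\end{aligned} \tag{20}$$

$$\frac{d}{d\alpha}\breve{D}(\alpha,i) = -[A(\alpha)+B(\alpha)K(\alpha)]^T\breve{D}(\alpha,i) - H(\alpha,i)B(\alpha)u_z(\alpha), \quad 2 \le i \le k, \tag{21}$$

and

$$\begin{aligned}\frac{d}{d\alpha}D(\alpha,1) = {} & -\mathrm{Tr}\{H(\alpha,1)G(\alpha)WG^T(\alpha)\} - 2\breve{D}^T(\alpha,1)B(\alpha)u_z(\alpha) \\ & - u_z^T(\alpha)R(\alpha)u_z(\alpha) - z^T(\alpha)Q(\alpha)z(\alpha),\end{aligned} \tag{22}$$

$$\frac{d}{d\alpha}D(\alpha,i) = -\mathrm{Tr}\{H(\alpha,i)G(\alpha)WG^T(\alpha)\} - 2\breve{D}^T(\alpha,i)B(\alpha)u_z(\alpha), \quad 2 \le i \le k, \tag{23}$$

where the terminal conditions are $H(t_f,1) = C^T(t_f)Q_fC(t_f)$, $H(t_f,i) = 0$ for $2 \le i \le k$; $\breve{D}(t_f,1) = -C^T(t_f)Q_fz(t_f)$, $\breve{D}(t_f,i) = 0$ for $2 \le i \le k$; and $D(t_f,1) = z^T(t_f)Q_fz(t_f)$, $D(t_f,i) = 0$ for $2 \le i \le k$.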
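The equations of Theorem 2 can be integrated numerically once $K$ and $u_z$ are fixed. The following minimal sketch does so for the scalar case ($n=m=r=p=1$) with constant coefficients and a classical fourth-order Runge-Kutta scheme run backward from $t_f$. The function name, the scalar restriction, and the choice of integrator are assumptions made for illustration.

```python
from math import comb

def cost_cumulants_scalar(a, b, c, g, w, q, r, qf, K, uz, z, x0, t0, tf, k, steps=2000):
    """Integrate scalar versions of (18)-(23) backward and return kappa_1..kappa_k.

    K, uz, z are callables of time (fixed gain, affine input, desired output).
    State layout: [H_1..H_k, Dbreve_1..Dbreve_k, D_1..D_k].
    """
    def rhs(t, s):
        H, Db = s[:k], s[k:2 * k]
        Kt, ut, zt = K(t), uz(t), z(t)
        ac = a + b * Kt  # closed-loop coefficient A + B K
        dH, dDb, dD = [], [], []
        for i in range(1, k + 1):
            h = -2.0 * ac * H[i - 1]
            v = -ac * Db[i - 1] - H[i - 1] * b * ut
            d = -H[i - 1] * g * w * g - 2.0 * Db[i - 1] * b * ut
            if i == 1:  # forcing terms appear only in the first-order equations
                h -= c * q * c + Kt * r * Kt
                v += -Kt * r * ut + c * q * zt
                d -= ut * r * ut + zt * q * zt
            else:  # coupling sum of (19); 2 i!/(j!(i-j)!) = 2 * comb(i, j)
                h -= sum(2 * comb(i, j) * H[j - 1] * g * w * g * H[i - j - 1]
                         for j in range(1, i))
            dH.append(h)
            dDb.append(v)
            dD.append(d)
        return dH + dDb + dD

    zf = z(tf)
    s = ([c * qf * c] + [0.0] * (k - 1)
         + [-c * qf * zf] + [0.0] * (k - 1)
         + [zf * qf * zf] + [0.0] * (k - 1))
    dt = -(tf - t0) / steps  # negative step: integrate from tf down to t0
    t = tf
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(s, k1)])
        k3 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(s, k2)])
        k4 = rhs(t + dt, [x + dt * y for x, y in zip(s, k3)])
        s = [x + dt / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
             for x, p1, p2, p3, p4 in zip(s, k1, k2, k3, k4)]
        t += dt
    # Assemble (17) for each order i.
    return [x0 * s[i] * x0 + 2 * x0 * s[k + i] + s[2 * k + i] for i in range(k)]
```

For the uncontrolled regulator case ($K = 0$, $u_z = 0$, $z = 0$, $a = -1$, $q = g = W = 1$, $Q_f = 0$, $x_0 = 2$ on $[0,1]$), the first cumulant can be checked by hand: it equals $\int_0^1 E\{x^2(t)\}\,dt = 1.75(1 - e^{-2}) + 0.5$.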

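II. Problem Statements

In preparing for the control statements of the tracking problem, let the $k$-tuple variables $\mathcal{H}$, $\breve{\mathcal{D}}$, and $\mathcal{D}$ be defined as $\mathcal{H}(\cdot) \triangleq (\mathcal{H}_1(\cdot),\ldots,\mathcal{H}_k(\cdot))$, $\breve{\mathcal{D}}(\cdot) \triangleq (\breve{\mathcal{D}}_1(\cdot),\ldots,\breve{\mathcal{D}}_k(\cdot))$, and $\mathcal{D}(\cdot) \triangleq (\mathcal{D}_1(\cdot),\ldots,\mathcal{D}_k(\cdot))$ for each element $\mathcal{H}_i \in C^1([t_0,t_f];\mathbb{R}^{n\times n})$ of $\mathcal{H}$, $\breve{\mathcal{D}}_i \in C^1([t_0,t_f];\mathbb{R}^n)$ of $\breve{\mathcal{D}}$, and $\mathcal{D}_i \in C^1([t_0,t_f];\mathbb{R})$ of $\mathcal{D}$ having the representations $\mathcal{H}_i(\cdot) = H(\cdot,i)$, $\breve{\mathcal{D}}_i(\cdot) = \breve{D}(\cdot,i)$, and $\mathcal{D}_i(\cdot) = D(\cdot,i)$, with the right members satisfying the dynamic equations (18)-(23) on the horizon $[t_0,t_f]$. The problem formulation is greatly simplified if the following convenient mappings are introduced

$$\mathcal{F}_i : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times\mathbb{R}^{m\times n} \mapsto \mathbb{R}^{n\times n},$$

$$\breve{\mathcal{G}}_i : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^{m\times n}\times\mathbb{R}^m \mapsto \mathbb{R}^n,$$

$$\mathcal{G}_i : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^m \mapsto \mathbb{R},$$

where the actions are given by

$$\begin{aligned}\mathcal{F}_1(\alpha,\mathcal{H},K) \triangleq {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\mathcal{H}_1(\alpha) - \mathcal{H}_1(\alpha)[A(\alpha)+B(\alpha)K(\alpha)] \\ & - C^T(\alpha)Q(\alpha)C(\alpha) - K^T(\alpha)R(\alpha)K(\alpha),\end{aligned}$$

$$\begin{aligned}\mathcal{F}_i(\alpha,\mathcal{H},K) \triangleq {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\mathcal{H}_i(\alpha) - \mathcal{H}_i(\alpha)[A(\alpha)+B(\alpha)K(\alpha)] \\ & - \sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}\mathcal{H}_j(\alpha)G(\alpha)WG^T(\alpha)\mathcal{H}_{i-j}(\alpha),\end{aligned}$$

$$\begin{aligned}\breve{\mathcal{G}}_1(\alpha,\mathcal{H},\breve{\mathcal{D}},K,u_z) \triangleq {} & -[A(\alpha)+B(\alpha)K(\alpha)]^T\breve{\mathcal{D}}_1(\alpha) - \mathcal{H}_1(\alpha)B(\alpha)u_z(\alpha) \\ & - K^T(\alpha)R(\alpha)u_z(\alpha) + C^T(\alpha)Q(\alpha)z(\alpha),\end{aligned}$$

$$\breve{\mathcal{G}}_i(\alpha,\mathcal{H},\breve{\mathcal{D}},K,u_z) \triangleq -[A(\alpha)+B(\alpha)K(\alpha)]^T\breve{\mathcal{D}}_i(\alpha) - \mathcal{H}_i(\alpha)B(\alpha)u_z(\alpha),$$

$$\begin{aligned}\mathcal{G}_1(\alpha,\mathcal{H},\breve{\mathcal{D}},u_z) \triangleq {} & -\mathrm{Tr}\{\mathcal{H}_1(\alpha)G(\alpha)WG^T(\alpha)\} - 2\breve{\mathcal{D}}_1^T(\alpha)B(\alpha)u_z(\alpha) \\ & - u_z^T(\alpha)R(\alpha)u_z(\alpha) - z^T(\alpha)Q(\alpha)z(\alpha),\end{aligned}$$

$$\mathcal{G}_i(\alpha,\mathcal{H},\breve{\mathcal{D}},u_z) \triangleq -\mathrm{Tr}\{\mathcal{H}_i(\alpha)G(\alpha)WG^T(\alpha)\} - 2\breve{\mathcal{D}}_i^T(\alpha)B(\alpha)u_z(\alpha),$$

for $2 \le i \le k$. Now there is no difficulty in establishing the product mappings

$$\mathcal{F}_1\times\cdots\times\mathcal{F}_k : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times\mathbb{R}^{m\times n} \mapsto (\mathbb{R}^{n\times n})^k,$$

$$\breve{\mathcal{G}}_1\times\cdots\times\breve{\mathcal{G}}_k : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^{m\times n}\times\mathbb{R}^m \mapsto (\mathbb{R}^n)^k,$$

$$\mathcal{G}_1\times\cdots\times\mathcal{G}_k : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^m \mapsto \mathbb{R}^k,$$

along with the corresponding notations $\mathcal{F} = \mathcal{F}_1\times\cdots\times\mathcal{F}_k$, $\breve{\mathcal{G}} = \breve{\mathcal{G}}_1\times\cdots\times\breve{\mathcal{G}}_k$, and $\mathcal{G} = \mathcal{G}_1\times\cdots\times\mathcal{G}_k$. Thus, the dynamic equations of motion (18)-(23) can be rewritten as

$$\frac{d}{d\alpha}\mathcal{H}(\alpha) = \mathcal{F}(\alpha,\mathcal{H}(\alpha),K(\alpha)), \quad \mathcal{H}(t_f) = \mathcal{H}_f,$$

$$\frac{d}{d\alpha}\breve{\mathcal{D}}(\alpha) = \breve{\mathcal{G}}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),K(\alpha),u_z(\alpha)), \quad \breve{\mathcal{D}}(t_f) = \breve{\mathcal{D}}_f,$$

$$\frac{d}{d\alpha}\mathcal{D}(\alpha) = \mathcal{G}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),u_z(\alpha)), \quad \mathcal{D}(t_f) = \mathcal{D}_f,$$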

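where the $k$-tuple terminal values are $\mathcal{H}_f = (C^T(t_f)Q_fC(t_f), 0, \ldots, 0)$, $\breve{\mathcal{D}}_f = (-C^T(t_f)Q_fz(t_f), 0, \ldots, 0)$, and $\mathcal{D}_f = (z^T(t_f)Q_fz(t_f), 0, \ldots, 0)$, in agreement with the terminal conditions of (18)-(23). Note that the product system uniquely determines $\mathcal{H}$, $\breve{\mathcal{D}}$, and $\mathcal{D}$ once the admissible affine control $u_z$ and feedback gain $K$ are specified. Hence, they are considered as $\mathcal{H} \equiv \mathcal{H}(\cdot,K)$, $\breve{\mathcal{D}} \equiv \breve{\mathcal{D}}(\cdot,K,u_z)$, and $\mathcal{D} \equiv \mathcal{D}(\cdot,K,u_z)$. The performance index in the cost-cumulant control problem can now be formulated in $u_z$ and $K$.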

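Definition 1: Performance Index.

Fix $k \in \mathbb{Z}^+$ and the sequence $\mu = \{\mu_i \ge 0\}_{i=1}^k$ with $\mu_1 > 0$. Then, for the given $(t_0,x_0)$, the performance index

$$\phi_{tk} : [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^k \mapsto \mathbb{R}^+$$

in cost-cumulant control for the tracking problem is defined as follows

$$\begin{aligned}\phi_{tk}\big(t_0,\mathcal{H}(t_0),\breve{\mathcal{D}}(t_0),\mathcal{D}(t_0)\big) & \triangleq \sum_{i=1}^k\mu_i\kappa_i(t_0,x_0;K,u_z) \\ & = x_0^T\sum_{i=1}^k\mu_i\mathcal{H}_i(t_0)x_0 + 2x_0^T\sum_{i=1}^k\mu_i\breve{\mathcal{D}}_i(t_0) + \sum_{i=1}^k\mu_i\mathcal{D}_i(t_0),\end{aligned} \tag{24}$$

where the scalar, real constants $\mu_i$ represent parametric design freedom and levels of influence on the overall cost distribution. The solutions $\{\mathcal{H}_i(t_0) \ge 0\}_{i=1}^k$, $\{\breve{\mathcal{D}}_i(t_0)\}_{i=1}^k$, and $\{\mathcal{D}_i(t_0)\}_{i=1}^k$ evaluated at $\alpha = t_0$ satisfy the dynamic equations of motion

$$\frac{d}{d\alpha}\mathcal{H}(\alpha) = \mathcal{F}(\alpha,\mathcal{H}(\alpha),K(\alpha)), \quad \mathcal{H}(t_f) = \mathcal{H}_f,$$

$$\frac{d}{d\alpha}\breve{\mathcal{D}}(\alpha) = \breve{\mathcal{G}}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),K(\alpha),u_z(\alpha)), \quad \breve{\mathcal{D}}(t_f) = \breve{\mathcal{D}}_f,$$

$$\frac{d}{d\alpha}\mathcal{D}(\alpha) = \mathcal{G}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),u_z(\alpha)), \quad \mathcal{D}(t_f) = \mathcal{D}_f.$$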
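Once the backward equations have been solved, the performance index is a plain weighted sum of the first $k$ cumulants. The sketch below evaluates it from the values at $t_0$, with matrices stored as nested lists; the function name and the data layout are illustrative assumptions.

```python
def performance_index(mu, H0, Db0, D0, x0):
    """Weighted cumulant sum: sum_i mu_i (x0' H_i x0 + 2 x0' Dbreve_i + D_i).

    mu:  weights mu_1..mu_k (mu_1 > 0)
    H0:  list of k n-by-n matrices H_i(t0), as nested lists
    Db0: list of k n-vectors Dbreve_i(t0)
    D0:  list of k scalars D_i(t0)
    """
    n = len(x0)
    total = 0.0
    for mu_i, H, Db, D in zip(mu, H0, Db0, D0):
        quad = sum(x0[p] * H[p][q] * x0[q] for p in range(n) for q in range(n))
        lin = 2.0 * sum(x0[p] * Db[p] for p in range(n))
        total += mu_i * (quad + lin + D)  # mu_i times the i-th cumulant
    return total
```

Choosing `mu = [1.0]` recovers the mean-cost criterion, while adding a positive `mu[1]` also penalizes the variance of the cost.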

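Definition 2: Affine Control and Feedback Gains.

Let compact subsets $\overline{U} \subset \mathbb{R}^m$ and $\overline{K} \subset \mathbb{R}^{m\times n}$ be the sets of allowable affine inputs and gain values. For the given $k \in \mathbb{Z}^+$ and the sequence $\mu = \{\mu_i \ge 0\}_{i=1}^k$ with $\mu_1 > 0$, the sets of admissible affine controls $\mathcal{U}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$ and feedback gains $\mathcal{K}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$ are respectively assumed to be the classes of $C([t_0,t_f];\mathbb{R}^m)$ and $C([t_0,t_f];\mathbb{R}^{m\times n})$ with values $u_z(\cdot) \in \overline{U}$ and $K(\cdot) \in \overline{K}$ for which solutions to the dynamic equations with $\mathcal{H}(t_f) = \mathcal{H}_f$, $\breve{\mathcal{D}}(t_f) = \breve{\mathcal{D}}_f$, and $\mathcal{D}(t_f) = \mathcal{D}_f$

$$\frac{d}{d\alpha}\mathcal{H}(\alpha) = \mathcal{F}(\alpha,\mathcal{H}(\alpha),K(\alpha)), \tag{25}$$

$$\frac{d}{d\alpha}\breve{\mathcal{D}}(\alpha) = \breve{\mathcal{G}}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),K(\alpha),u_z(\alpha)), \tag{26}$$

$$\frac{d}{d\alpha}\mathcal{D}(\alpha) = \mathcal{G}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),u_z(\alpha)) \tag{27}$$

exist on the interval of optimization $[t_0,t_f]$.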

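Definition 3: Optimization Problem.

Suppose that $k \in \mathbb{Z}^+$ and the sequence $\mu = \{\mu_i \ge 0\}_{i=1}^k$ with $\mu_1 > 0$ are fixed. Then the control optimization problem is defined as the minimization of (24) over $u_z(\cdot) \in \mathcal{U}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$ and $K(\cdot) \in \mathcal{K}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$, subject to the dynamic equations of motion (25)-(27) for $\alpha \in [t_0,t_f]$.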
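Definition 4: Reachable Set.

Let the reachable set $\mathcal{Q}$ be defined as $\mathcal{Q} \triangleq \big\{(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}) \in [t_0,t_f]\times(\mathbb{R}^{n\times n})^k\times(\mathbb{R}^n)^k\times\mathbb{R}^k$ such that $\mathcal{U}_{\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z};\mu} \ne \emptyset$ and $\mathcal{K}_{\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z};\mu} \ne \emptyset\big\}$.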

By adapting to the initial-cost problem and the terminology of cost-cumulant control, the Hamilton-Jacobi-Bellman (HJB) equation satisfied by the value function is motivated by the treatment in [1] and is given below.

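Theorem 3: HJB Equation-Mayer Problem.

Let $(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ be any interior point of the reachable set $\mathcal{Q}$ at which the value function $\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ is differentiable. If there exist an optimal affine control $u_z^* \in \mathcal{U}_{\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z};\mu}$ and an optimal feedback gain $K^* \in \mathcal{K}_{\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z};\mu}$, then the partial differential equation of dynamic programming

$$\begin{aligned}0 = \min_{u_z\in\overline{U},\,K\in\overline{K}}\Big\{ & \frac{\partial}{\partial\varepsilon}\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}) + \frac{\partial}{\partial\,\mathrm{vec}(\mathcal{Y})}\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})\,\mathrm{vec}\big(\mathcal{F}(\varepsilon,\mathcal{Y},K)\big) \\ & + \frac{\partial}{\partial\,\mathrm{vec}(\breve{\mathcal{Z}})}\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})\,\mathrm{vec}\big(\breve{\mathcal{G}}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},K,u_z)\big) \\ & + \frac{\partial}{\partial\,\mathrm{vec}(\mathcal{Z})}\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})\,\mathrm{vec}\big(\mathcal{G}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},u_z)\big)\Big\}\end{aligned} \tag{28}$$

is satisfied together with the boundary value condition $\mathcal{V}(t_0,\mathcal{H}_0,\breve{\mathcal{D}}_0,\mathcal{D}_0) = \phi_{tk}(t_0,\mathcal{H}_0,\breve{\mathcal{D}}_0,\mathcal{D}_0)$.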

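Theorem 4: Verification Theorem.

Fix $k \in \mathbb{Z}^+$ and let $\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ be a continuously differentiable solution of the HJB equation (28) which satisfies the boundary condition $\mathcal{W}(t_0,\mathcal{H}_0,\breve{\mathcal{D}}_0,\mathcal{D}_0) = \phi_{tk}(t_0,\mathcal{H}_0,\breve{\mathcal{D}}_0,\mathcal{D}_0)$. Let $(t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f)$ be in $\mathcal{Q}$; $(u_z,K)$ in $\mathcal{U}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}\times\mathcal{K}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$; and $\mathcal{H}$, $\breve{\mathcal{D}}$, and $\mathcal{D}$ the corresponding solutions of (25)-(27). Then $\mathcal{W}(\alpha,\mathcal{H}(\alpha),\breve{\mathcal{D}}(\alpha),\mathcal{D}(\alpha))$ is a non-increasing function of $\alpha$. If $(u_z^*,K^*)$ is in $\mathcal{U}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}\times\mathcal{K}_{t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f;\mu}$ defined on $[t_0,t_f]$ with corresponding solutions $\mathcal{H}^*$, $\breve{\mathcal{D}}^*$, and $\mathcal{D}^*$ of (25)-(27) such that for $\alpha \in [t_0,t_f]$

$$\begin{aligned}0 = {} & \frac{\partial}{\partial\varepsilon}\mathcal{W}\big(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),\mathcal{D}^*(\alpha)\big) \\ & + \frac{\partial}{\partial\,\mathrm{vec}(\mathcal{Y})}\mathcal{W}\big(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),\mathcal{D}^*(\alpha)\big)\cdot\mathrm{vec}\big(\mathcal{F}(\alpha,\mathcal{H}^*(\alpha),K^*(\alpha))\big) \\ & + \frac{\partial}{\partial\,\mathrm{vec}(\breve{\mathcal{Z}})}\mathcal{W}\big(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),\mathcal{D}^*(\alpha)\big)\cdot\mathrm{vec}\big(\breve{\mathcal{G}}(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),K^*(\alpha),u_z^*(\alpha))\big) \\ & + \frac{\partial}{\partial\,\mathrm{vec}(\mathcal{Z})}\mathcal{W}\big(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),\mathcal{D}^*(\alpha)\big)\cdot\mathrm{vec}\big(\mathcal{G}(\alpha,\mathcal{H}^*(\alpha),\breve{\mathcal{D}}^*(\alpha),u_z^*(\alpha))\big),\end{aligned} \tag{29}$$

then $u_z^*$ and $K^*$ are optimal. Moreover,

$$\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}) = \mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}), \tag{30}$$

where $\mathcal{V}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ is the value function.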

III. Optimal Solution of kCC Control

The HJB approach to the cost-cumulant control problem requires parameterizing the terminal time and states of the dynamical equations as $(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ rather than $(t_f,\mathcal{H}_f,\breve{\mathcal{D}}_f,\mathcal{D}_f)$. That is, for $\varepsilon \in [t_0,t_f]$ and $1 \le i \le k$, the states of the system (25)-(27) defined on the interval $[t_0,\varepsilon]$ have the terminal values denoted by $\mathcal{H}(\varepsilon) = \mathcal{Y}$, $\breve{\mathcal{D}}(\varepsilon) = \breve{\mathcal{Z}}$, and $\mathcal{D}(\varepsilon) = \mathcal{Z}$. It is observed that the performance index (24) is quadratic-affine in the arbitrarily fixed $x_0$. This suggests that a solution to the HJB equation (28) may be sought as follows.

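Theorem 5: Fix $k \in \mathbb{Z}^+$ and let $(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ be any interior point of $\mathcal{Q}$ at which the scalar-valued function

$$\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}) = x_0^T\sum_{i=1}^k\mu_i\big(\mathcal{Y}_i + \mathcal{E}_i(\varepsilon)\big)x_0 + 2x_0^T\sum_{i=1}^k\mu_i\big(\breve{\mathcal{Z}}_i + \breve{\mathcal{T}}_i(\varepsilon)\big) + \sum_{i=1}^k\mu_i\big(\mathcal{Z}_i + \mathcal{T}_i(\varepsilon)\big) \tag{31}$$

is differentiable. The time-varying parametric functions $\mathcal{E}_i \in C^1([t_0,t_f];\mathbb{R}^{n\times n})$, $\breve{\mathcal{T}}_i \in C^1([t_0,t_f];\mathbb{R}^n)$, and $\mathcal{T}_i \in C^1([t_0,t_f];\mathbb{R})$ are yet to be determined. The derivative of $\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ with respect to $\varepsilon$ is given by

$$\begin{aligned}\frac{d}{d\varepsilon}\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z}) = {} & x_0^T\sum_{i=1}^k\mu_i\Big(\mathcal{F}_i(\varepsilon,\mathcal{Y},K) + \frac{d}{d\varepsilon}\mathcal{E}_i(\varepsilon)\Big)x_0 \\ & + 2x_0^T\sum_{i=1}^k\mu_i\Big(\breve{\mathcal{G}}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},K,u_z) + \frac{d}{d\varepsilon}\breve{\mathcal{T}}_i(\varepsilon)\Big) \\ & + \sum_{i=1}^k\mu_i\Big(\mathcal{G}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},u_z) + \frac{d}{d\varepsilon}\mathcal{T}_i(\varepsilon)\Big),\end{aligned} \tag{32}$$

provided $u_z \in \overline{U}$ and $K \in \overline{K}$.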

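Substituting (31) into the HJB equation (28), it follows that

$$\begin{aligned}0 = \min_{u_z\in\overline{U},\,K\in\overline{K}}\Big\{ & x_0^T\sum_{i=1}^k\mu_i\Big(\mathcal{F}_i(\varepsilon,\mathcal{Y},K) + \frac{d}{d\varepsilon}\mathcal{E}_i(\varepsilon)\Big)x_0 \\ & + 2x_0^T\sum_{i=1}^k\mu_i\Big(\breve{\mathcal{G}}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},K,u_z) + \frac{d}{d\varepsilon}\breve{\mathcal{T}}_i(\varepsilon)\Big) \\ & + \sum_{i=1}^k\mu_i\Big(\mathcal{G}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},u_z) + \frac{d}{d\varepsilon}\mathcal{T}_i(\varepsilon)\Big)\Big\}.\end{aligned} \tag{33}$$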

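Note that

$$\begin{aligned}\sum_{i=1}^k\mu_i\mathcal{F}_i(\varepsilon,\mathcal{Y},K) = {} & -[A(\varepsilon)+B(\varepsilon)K]^T\sum_{i=1}^k\mu_i\mathcal{Y}_i - \sum_{i=1}^k\mu_i\mathcal{Y}_i[A(\varepsilon)+B(\varepsilon)K] - \mu_1C^T(\varepsilon)Q(\varepsilon)C(\varepsilon) \\ & - \mu_1K^TR(\varepsilon)K - \sum_{i=2}^k\mu_i\sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}\mathcal{Y}_jG(\varepsilon)WG^T(\varepsilon)\mathcal{Y}_{i-j},\end{aligned}$$

$$\begin{aligned}\sum_{i=1}^k\mu_i\breve{\mathcal{G}}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},K,u_z) = {} & -[A(\varepsilon)+B(\varepsilon)K]^T\sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i - \sum_{i=1}^k\mu_i\mathcal{Y}_iB(\varepsilon)u_z \\ & - \mu_1K^TR(\varepsilon)u_z + \mu_1C^T(\varepsilon)Q(\varepsilon)z(\varepsilon),\end{aligned}$$

$$\begin{aligned}\sum_{i=1}^k\mu_i\mathcal{G}_i(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},u_z) = {} & -\sum_{i=1}^k\mu_i\mathrm{Tr}\{\mathcal{Y}_iG(\varepsilon)WG^T(\varepsilon)\} - 2\sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i^TB(\varepsilon)u_z \\ & - \mu_1u_z^TR(\varepsilon)u_z - \mu_1z^T(\varepsilon)Q(\varepsilon)z(\varepsilon).\end{aligned}$$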
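The stationarity computation behind the minimization can be made explicit. As a sketch, introduce the shorthand $\mathcal{Y}_\mu \triangleq \sum_{i=1}^k\mu_i\mathcal{Y}_i$ and $\breve{\mathcal{Z}}_\mu \triangleq \sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i$ (used here only for brevity, with each $\mathcal{Y}_i$ assumed symmetric). Collecting the terms of the three sums above that depend on $u_z$ and $K$, weighting them by $x_0^T(\cdot)x_0$, $2x_0^T(\cdot)$, and $1$ respectively, and setting the gradient with respect to $u_z$ to zero gives

$$B^T(\varepsilon)\big[\mathcal{Y}_\mu x_0 + \breve{\mathcal{Z}}_\mu\big] + \mu_1R(\varepsilon)\big[Kx_0 + u_z\big] = 0,$$

and the gradient with respect to $K$ is the same vector multiplied on the right by $x_0^T$. Requiring the condition to hold for every $x_0$ separates the part that is linear in $x_0$ from the part that is independent of it:

$$B^T(\varepsilon)\mathcal{Y}_\mu + \mu_1R(\varepsilon)K = 0, \qquad B^T(\varepsilon)\breve{\mathcal{Z}}_\mu + \mu_1R(\varepsilon)u_z = 0.$$

Since $R(\varepsilon)$ is invertible and $\mu_1 > 0$, these two equations determine $K = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k(\mu_r/\mu_1)\mathcal{Y}_r$ and $u_z = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k(\mu_r/\mu_1)\breve{\mathcal{Z}}_r$.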

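Since $x_0$ and $M_0 \triangleq x_0x_0^T$ are an arbitrary vector and rank-one matrix, the necessary condition for an extremum of (24) on $[t_0,\varepsilon]$ is obtained by differentiating the expression within the braces of (33) with respect to $u_z$ and $K$

$$u_z(\varepsilon,\breve{\mathcal{Z}}) = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{Z}}_r, \tag{34}$$

$$K(\varepsilon,\mathcal{Y}) = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\mathcal{Y}_r, \tag{35}$$

where $\hat{\mu}_r \triangleq \mu_r/\mu_1$ and $\mu_1 > 0$. Substituting (34) and (35) into (33) leads to the value function

$$\begin{aligned} & x_0^T\Big[\sum_{i=1}^k\mu_i\frac{d}{d\varepsilon}\mathcal{E}_i(\varepsilon) - A^T(\varepsilon)\sum_{i=1}^k\mu_i\mathcal{Y}_i - \sum_{i=1}^k\mu_i\mathcal{Y}_iA(\varepsilon) - \mu_1C^T(\varepsilon)Q(\varepsilon)C(\varepsilon) \\ & \qquad + \sum_{r=1}^k\hat{\mu}_r\mathcal{Y}_rB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\mu_s\mathcal{Y}_s + \sum_{r=1}^k\mu_r\mathcal{Y}_rB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\mathcal{Y}_s \\ & \qquad - \mu_1\sum_{r=1}^k\hat{\mu}_r\mathcal{Y}_rB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\mathcal{Y}_s - \sum_{i=2}^k\mu_i\sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}\mathcal{Y}_jG(\varepsilon)WG^T(\varepsilon)\mathcal{Y}_{i-j}\Big]x_0 \\ & + 2x_0^T\Big[\sum_{i=1}^k\mu_i\frac{d}{d\varepsilon}\breve{\mathcal{T}}_i(\varepsilon) - A^T(\varepsilon)\sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i + \mu_1C^T(\varepsilon)Q(\varepsilon)z(\varepsilon) \\ & \qquad + \sum_{r=1}^k\hat{\mu}_r\mathcal{Y}_rB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i + \sum_{i=1}^k\mu_i\mathcal{Y}_iB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{Z}}_r \\ & \qquad - \mu_1\sum_{r=1}^k\hat{\mu}_r\mathcal{Y}_rB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\breve{\mathcal{Z}}_s\Big] \\ & + \sum_{i=1}^k\mu_i\frac{d}{d\varepsilon}\mathcal{T}_i(\varepsilon) - \sum_{i=1}^k\mu_i\mathrm{Tr}\{\mathcal{Y}_iG(\varepsilon)WG^T(\varepsilon)\} + 2\sum_{i=1}^k\mu_i\breve{\mathcal{Z}}_i^TB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{Z}}_r \\ & \qquad - \mu_1z^T(\varepsilon)Q(\varepsilon)z(\varepsilon) - \mu_1\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{Z}}_r^TB(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\breve{\mathcal{Z}}_s.\end{aligned} \tag{36}$$

The remaining task is to display time-dependent functions $\{\mathcal{E}_i(\cdot)\}_{i=1}^k$, $\{\breve{\mathcal{T}}_i(\cdot)\}_{i=1}^k$, and $\{\mathcal{T}_i(\cdot)\}_{i=1}^k$ which yield a sufficient condition for the expression (36) to be zero for any $\varepsilon \in [t_0,t_f]$ when $\{\mathcal{Y}_i\}_{i=1}^k$, $\{\breve{\mathcal{Z}}_i\}_{i=1}^k$, and $\{\mathcal{Z}_i\}_{i=1}^k$ are evaluated along solutions to the cumulant-generating equations. A careful observation of (36) suggests that $\{\mathcal{E}_i(\cdot)\}_{i=1}^k$, $\{\breve{\mathcal{T}}_i(\cdot)\}_{i=1}^k$, and $\{\mathcal{T}_i(\cdot)\}_{i=1}^k$ can be chosen to satisfy certain differential equations whose explicit representations are omitted herein due to space limitations. The affine control and feedback gain specified in (34) and (35) are now applied along the solution trajectories of the equations (25)-(27)

$$\begin{aligned}\frac{d}{d\varepsilon}\mathcal{H}_1(\varepsilon) = {} & -A^T(\varepsilon)\mathcal{H}_1(\varepsilon) - \mathcal{H}_1(\varepsilon)A(\varepsilon) - C^T(\varepsilon)Q(\varepsilon)C(\varepsilon) \\ & + \mathcal{H}_1(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\mathcal{H}_s(\varepsilon) + \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\mathcal{H}_1(\varepsilon) \\ & - \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\mathcal{H}_s(\varepsilon),\end{aligned} \tag{37}$$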


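$$\begin{aligned}\frac{d}{d\varepsilon}\mathcal{H}_i(\varepsilon) = {} & -A^T(\varepsilon)\mathcal{H}_i(\varepsilon) - \mathcal{H}_i(\varepsilon)A(\varepsilon) \\ & + \mathcal{H}_i(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\mathcal{H}_s(\varepsilon) + \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\mathcal{H}_i(\varepsilon) \\ & - \sum_{j=1}^{i-1}\frac{2\,i!}{j!\,(i-j)!}\mathcal{H}_j(\varepsilon)G(\varepsilon)WG^T(\varepsilon)\mathcal{H}_{i-j}(\varepsilon), \quad 2 \le i \le k,\end{aligned} \tag{38}$$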


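$$\begin{aligned}\frac{d}{d\varepsilon}\breve{\mathcal{D}}_1(\varepsilon) = {} & -A^T(\varepsilon)\breve{\mathcal{D}}_1(\varepsilon) + C^T(\varepsilon)Q(\varepsilon)z(\varepsilon) + \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\breve{\mathcal{D}}_1(\varepsilon) \\ & + \mathcal{H}_1(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r(\varepsilon) \\ & - \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\breve{\mathcal{D}}_s(\varepsilon),\end{aligned} \tag{39}$$

$$\begin{aligned}\frac{d}{d\varepsilon}\breve{\mathcal{D}}_i(\varepsilon) = {} & -A^T(\varepsilon)\breve{\mathcal{D}}_i(\varepsilon) + \sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\breve{\mathcal{D}}_i(\varepsilon) \\ & + \mathcal{H}_i(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r(\varepsilon), \quad 2 \le i \le k,\end{aligned} \tag{40}$$

$$\begin{aligned}\frac{d}{d\varepsilon}\mathcal{D}_1(\varepsilon) = {} & -\mathrm{Tr}\{\mathcal{H}_1(\varepsilon)G(\varepsilon)WG^T(\varepsilon)\} - z^T(\varepsilon)Q(\varepsilon)z(\varepsilon) \\ & + 2\breve{\mathcal{D}}_1^T(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r(\varepsilon) \\ & - \sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r^T(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{s=1}^k\hat{\mu}_s\breve{\mathcal{D}}_s(\varepsilon),\end{aligned} \tag{41}$$

$$\frac{d}{d\varepsilon}\mathcal{D}_i(\varepsilon) = -\mathrm{Tr}\{\mathcal{H}_i(\varepsilon)G(\varepsilon)WG^T(\varepsilon)\} + 2\breve{\mathcal{D}}_i^T(\varepsilon)B(\varepsilon)R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r(\varepsilon), \quad 2 \le i \le k, \tag{42}$$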

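where the terminal conditions are $\mathcal{H}_1(t_f) = C^T(t_f)Q_fC(t_f)$, $\mathcal{H}_i(t_f) = 0$ for $2 \le i \le k$; $\breve{\mathcal{D}}_1(t_f) = -C^T(t_f)Q_fz(t_f)$, $\breve{\mathcal{D}}_i(t_f) = 0$ for $2 \le i \le k$; and $\mathcal{D}_1(t_f) = z^T(t_f)Q_fz(t_f)$, $\mathcal{D}_i(t_f) = 0$ for $2 \le i \le k$. The boundary condition of $\mathcal{W}(\varepsilon,\mathcal{Y},\breve{\mathcal{Z}},\mathcal{Z})$ implies that

$$\begin{aligned} & x_0^T\sum_{i=1}^k\mu_i\big(\mathcal{H}_{i0} + \mathcal{E}_i(t_0)\big)x_0 + 2x_0^T\sum_{i=1}^k\mu_i\big(\breve{\mathcal{D}}_{i0} + \breve{\mathcal{T}}_i(t_0)\big) + \sum_{i=1}^k\mu_i\big(\mathcal{D}_{i0} + \mathcal{T}_i(t_0)\big) \\ & \qquad = x_0^T\sum_{i=1}^k\mu_i\mathcal{H}_{i0}\,x_0 + 2x_0^T\sum_{i=1}^k\mu_i\breve{\mathcal{D}}_{i0} + \sum_{i=1}^k\mu_i\mathcal{D}_{i0}.\end{aligned}$$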
2=1  2=1  2=1 

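Therefore, the extremizing affine control (34) and state-feedback gain (35) minimizing (24) become optimal

$$u_z^*(\varepsilon) = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r^*(\varepsilon),$$

$$K^*(\varepsilon) = -R^{-1}(\varepsilon)B^T(\varepsilon)\sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r^*(\varepsilon).$$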
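As a consistency check (a sketch added for orientation, not a result claimed in the paper), consider $k = 1$, so that the only normalized weight is $\hat{\mu}_1 = 1$. The gain becomes $K^* = -R^{-1}B^T\mathcal{H}_1^*$, and the three feedback terms in the closed-loop equations for $\mathcal{H}_1$ and $\breve{\mathcal{D}}_1$ each combine into a single term (time arguments and stars suppressed, $\mathcal{H}_1$ symmetric):

$$\frac{d}{d\varepsilon}\mathcal{H}_1 = -A^T\mathcal{H}_1 - \mathcal{H}_1A - C^TQC + \mathcal{H}_1BR^{-1}B^T\mathcal{H}_1, \quad \mathcal{H}_1(t_f) = C^T(t_f)Q_fC(t_f),$$

$$\frac{d}{d\varepsilon}\breve{\mathcal{D}}_1 = -\big[A - BR^{-1}B^T\mathcal{H}_1\big]^T\breve{\mathcal{D}}_1 + C^TQz, \quad \breve{\mathcal{D}}_1(t_f) = -C^T(t_f)Q_fz(t_f).$$

These are the familiar Riccati and feedforward equations of the minimum-mean linear-quadratic tracker, which is the special case the cost-cumulant formulation is expected to contain.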

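Theorem 6: Cost-Cumulant Control Solution.

The tracker dynamics governed by (5)-(6) attempt to track the desired trajectory $z(t)$ with the Chi-square random cost (7). Assume both $k \in \mathbb{Z}^+$ and the sequence $\mu = \{\mu_i \ge 0\}_{i=1}^k$ with $\mu_1 > 0$ are fixed. Then, the control solution for the multi-cumulant tracking problem is implemented by

$$u^*(t) = K^*(t)x^*(t) + u_z^*(t), \tag{43}$$

$$K^*(\alpha) = -R^{-1}(\alpha)B^T(\alpha)\sum_{r=1}^k\hat{\mu}_r\mathcal{H}_r^*(\alpha), \tag{44}$$

$$u_z^*(\alpha) = -R^{-1}(\alpha)B^T(\alpha)\sum_{r=1}^k\hat{\mu}_r\breve{\mathcal{D}}_r^*(\alpha), \tag{45}$$

where the normalized weights $\hat{\mu}_r = \mu_r/\mu_1$ represent the different levels of influence deemed important to the overall cost distribution, and $\{\mathcal{H}_r^*(\alpha)\}_{r=1}^k$ and $\{\breve{\mathcal{D}}_r^*(\alpha)\}_{r=1}^k$ are the solutions of the backward-in-time Riccati-type matrix differential equations

$$\begin{aligned}\frac{d}{d\alpha}\mathcal{H}_1^*(\alpha) = {} & -[A(\alpha)+B(\alpha)K^*(\alpha)]^T\mathcal{H}_1^*(\alpha) - \mathcal{H}_1^*(\alpha)[A(\alpha)+B(\alpha)K^*(\alpha)] \\ & - C^T(\alpha)Q(\alpha)C(\alpha) - K^{*T}(\alpha)R(\alpha)K^*(\alpha),\end{aligned} \tag{46}$$

$$\begin{aligned}\frac{d}{d\alpha}\mathcal{H}_r^*(\alpha) = {} & -[A(\alpha)+B(\alpha)K^*(\alpha)]^T\mathcal{H}_r^*(\alpha) - \mathcal{H}_r^*(\alpha)[A(\alpha)+B(\alpha)K^*(\alpha)] \\ & - \sum_{s=1}^{r-1}\frac{2\,r!}{s!\,(r-s)!}\mathcal{H}_s^*(\alpha)G(\alpha)WG^T(\alpha)\mathcal{H}_{r-s}^*(\alpha), \quad 2 \le r \le k,\end{aligned} \tag{47}$$

and the auxiliary backward-in-time vector-valued differential equations

$$\begin{aligned}\frac{d}{d\alpha}\breve{\mathcal{D}}_1^*(\alpha) = {} & -[A(\alpha)+B(\alpha)K^*(\alpha)]^T\breve{\mathcal{D}}_1^*(\alpha) - \mathcal{H}_1^*(\alpha)B(\alpha)u_z^*(\alpha) \\ & - K^{*T}(\alpha)R(\alpha)u_z^*(\alpha) + C^T(\alpha)Q(\alpha)z(\alpha),\end{aligned} \tag{48}$$

$$\frac{d}{d\alpha}\breve{\mathcal{D}}_r^*(\alpha) = -[A(\alpha)+B(\alpha)K^*(\alpha)]^T\breve{\mathcal{D}}_r^*(\alpha) - \mathcal{H}_r^*(\alpha)B(\alpha)u_z^*(\alpha), \quad 2 \le r \le k, \tag{49}$$

with the terminal boundary conditions $\mathcal{H}_1^*(t_f) = C^T(t_f)Q_fC(t_f)$, $\mathcal{H}_r^*(t_f) = 0$ for $2 \le r \le k$, and $\breve{\mathcal{D}}_1^*(t_f) = -C^T(t_f)Q_fz(t_f)$, $\breve{\mathcal{D}}_r^*(t_f) = 0$ for $2 \le r \le k$.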
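To illustrate how Theorem 6 could be used in practice, the sketch below integrates the coupled equations (46)-(49) backward in the scalar case ($n=m=r=p=1$) with constant coefficients, recomputing $K^*$ and $u_z^*$ from (44)-(45) at every stage of a fourth-order Runge-Kutta step. The function name, the scalar restriction, and the integrator are assumptions for illustration and are not taken from the paper.

```python
from math import comb

def kcc_tracker_scalar(a, b, c, g, w, q, r, qf, z, mu, t0, tf, steps=2000):
    """Integrate scalar versions of (46)-(49) backward; return (K*(t0), u_z*(t0)).

    z is a callable desired output; mu holds the weights mu_1..mu_k (mu_1 > 0).
    State layout: [H_1..H_k, Dbreve_1..Dbreve_k].
    """
    k = len(mu)
    mu_hat = [m / mu[0] for m in mu]  # normalized weights mu_r / mu_1

    def gains(s):
        K = -(b / r) * sum(mh * h for mh, h in zip(mu_hat, s[:k]))   # (44)
        uz = -(b / r) * sum(mh * d for mh, d in zip(mu_hat, s[k:]))  # (45)
        return K, uz

    def rhs(t, s):
        H, Db = s[:k], s[k:]
        K, uz = gains(s)
        ac = a + b * K  # closed-loop coefficient A + B K*
        dH, dDb = [], []
        for i in range(1, k + 1):
            h = -2.0 * ac * H[i - 1]
            d = -ac * Db[i - 1] - H[i - 1] * b * uz
            if i == 1:
                h -= c * q * c + K * r * K
                d += -K * r * uz + c * q * z(t)
            else:  # 2 r!/(s!(r-s)!) = 2 * comb(r, s)
                h -= sum(2 * comb(i, j) * H[j - 1] * g * w * g * H[i - j - 1]
                         for j in range(1, i))
            dH.append(h)
            dDb.append(d)
        return dH + dDb

    s = [c * qf * c] + [0.0] * (k - 1) + [-c * qf * z(tf)] + [0.0] * (k - 1)
    dt = -(tf - t0) / steps  # negative step: integrate from tf down to t0
    t = tf
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(s, k1)])
        k3 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(s, k2)])
        k4 = rhs(t + dt, [x + dt * y for x, y in zip(s, k3)])
        s = [x + dt / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
             for x, p1, p2, p3, p4 in zip(s, k1, k2, k3, k4)]
        t += dt
    return gains(s)
```

For $k = 1$ with $a = 0$, $b = c = q = r = 1$, $Q_f = 0$, and $z \equiv 0$ on $[0,1]$, the scalar Riccati equation has the closed-form solution $\mathcal{H}_1(\alpha) = \tanh(1-\alpha)$, so the returned gain should be close to $-\tanh(1) \approx -0.7616$. Adding a second weight, for example `mu = [1.0, 0.5]`, brings the noise intensity into the gain through the coupling term of (47).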